Base Noun Phrase Translation Using Web Data and the EM Algorithm
Abstract
We consider the problem of Base Noun Phrase (Base NP) translation and propose a new method for the task. For a given Base NP, we first search the web for its translation candidates. We then determine the possible translation(s) from among the candidates using one of two methods that we have developed. In one method, we employ an ensemble of Naïve Bayesian Classifiers constructed with the EM Algorithm. In the other method, we use TF-IDF vectors also constructed with the EM Algorithm. Experimental results indicate that the coverage and accuracy of our method are significantly better than those of the baseline methods relying on existing technologies.
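The candidate-selection step can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes each translation candidate comes with a bag of context words, and that the context of the source Base NP has already been mapped into the target-language vocabulary. The EM-based construction mentioned in the abstract is not reproduced here, and the function names and toy data are hypothetical. Candidates are ranked by cosine similarity between smoothed TF-IDF vectors.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Build smoothed TF-IDF vectors for a dict mapping name -> list of tokens."""
    n = len(docs)
    df = Counter()
    for tokens in docs.values():
        df.update(set(tokens))  # document frequency: count each term once per doc
    vectors = {}
    for name, tokens in docs.items():
        tf = Counter(tokens)
        # Smoothed IDF keeps every weight strictly positive.
        vectors[name] = {
            t: c * (math.log((1 + n) / (1 + df[t])) + 1.0) for t, c in tf.items()
        }
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def rank_candidates(source_context, candidate_contexts):
    """Return (score, candidate) pairs sorted from most to least similar."""
    docs = dict(candidate_contexts)
    docs["__source__"] = source_context  # hypothetical reserved key
    vectors = tfidf_vectors(docs)
    source_vec = vectors.pop("__source__")
    scored = [(cosine(source_vec, vec), name) for name, vec in vectors.items()]
    return sorted(scored, reverse=True)


# Toy data (invented for illustration only).
source_context = ["data", "learning", "algorithm", "model"]
candidate_contexts = {
    "candidate_a": ["learning", "algorithm", "data", "training"],
    "candidate_b": ["car", "engine", "road", "wheel"],
}
ranking = rank_candidates(source_context, candidate_contexts)
print(ranking)
```

In this toy setup the candidate whose context overlaps with the source context is ranked first, and a candidate with no shared context words scores zero.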
Similar resources
Mining Interesting Aspects of a Product using Aspect-based Opinion Mining from Product Reviews (RESEARCH NOTE)
As the internet and its applications grow, E-commerce has become one of its most rapidly growing applications. E-commerce customers are given the opportunity to express their opinions about a product on the web as text in the form of reviews. In previous studies, merely finding the sentiment of a review was not sufficient to capture the exact opinion it expresses. In this paper, we have used A...
A Method of Cross-Lingual Question-Answering Based on Machine Translation and Noun Phrase Translation using Web documents - Yokohama National University at NTCIR-6 CLQA
We propose a method of English-Japanese cross-lingual question-answering (E-J CLQA) that uses machine translation (MT) and an existing Japanese QA system. We also introduce noun phrase translation using Web documents in order to compensate for the insufficiencies in the bilingual dictionary of the MT system. We combine several phrase translation techniques including 1) phrase translation using Wiki...
Investigating Embedded Question Reuse in Question Answering
This paper presents a novel method in question answering (QA) that enables a QA system to gain performance by reusing information in the answer to one question to answer another, related question. Our analysis shows that a pair of questions in general open-domain QA can have an embedding relation through their mentions of noun phrase expressions. We present methods f...
Determiners and Number in English contrasted with Japanese, as exemplified in Machine Translation
The fact that concepts are grammaticalized differently in different languages is a major problem for translation, especially for machine translation. Two major examples of this are syntactic number, and the use of (in)definite articles (a, some, the). In languages such as English, nouns are marked for number and the choice of article (or of no article) must be made for every noun phrase. In con...
The Role of Lexicalization and Pruning for Base Noun Phrase Grammars
This paper explores the role of lexicalization and pruning of grammars for base noun phrase identification. We modify our original framework (Cardie & Pierce 1998) to extract lexicalized treebank grammars that assign a score to each potential noun phrase based upon both the part-of-speech tag sequence and the word sequence of the phrase. We evaluate the modified framework on the “simple” and “c...